Linking Archives Using Document Enrichment and Term Selection

Authors

  • Marc Bron
  • Bouke Huurnink
  • Maarten de Rijke
Abstract

News, multimedia and cultural heritage archives are increasingly offering opportunities to create connections between their collections. We consider the task of linking archives: connecting an item in one archive to one or more items in other, often complementary archives. We focus on a specific instance of the task: linking items with a rich textual representation in a news archive to items with sparse annotations in a multimedia archive, where items should be linked if they describe the same or a related event. We find that the difference in textual richness of annotations presents a challenge and investigate two approaches: (i) to enrich sparsely annotated items with textually rich content; and (ii) to reduce rich news archive items using term selection. We demonstrate the positive impact of both approaches on linking to same events and linking to related events.
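
The second approach, reducing a rich news item to a few terms before matching it against sparse annotations, can be illustrated with a minimal sketch. The abstract does not say which term selection method or matching model is used, so the TF-IDF weighting and the simple term-overlap ranking below are assumptions chosen for illustration, and the names `select_terms` and `rank_items` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def select_terms(doc, background, k=5):
    """Return the k terms of `doc` with the highest TF-IDF weight.

    `background` is a list of texts used to estimate document frequency.
    TF-IDF is an assumed weighting, not necessarily the paper's method.
    """
    tf = Counter(tokenize(doc))
    df = Counter()
    for other in background:
        df.update(set(tokenize(other)))
    n = len(background)

    def weight(term):
        # Smoothed IDF: terms found in every background text get weight 0.
        return tf[term] * math.log((n + 1) / (df[term] + 1))

    # Sort by descending weight; break ties alphabetically for determinism.
    return sorted(tf, key=lambda t: (-weight(t), t))[:k]


def rank_items(query_terms, items):
    """Rank sparsely annotated items by how many query terms they contain.

    `items` maps an item id to its annotation text. Items that share no
    term with the query are dropped.
    """
    query = set(query_terms)
    scored = []
    for item_id, annotation in items.items():
        overlap = len(query & set(tokenize(annotation)))
        if overlap > 0:
            scored.append((-overlap, item_id))
    return [item_id for _, item_id in sorted(scored)]
```

With a news article as `doc` and the news archive as `background`, `select_terms` yields a short query that `rank_items` can match against the annotations of the multimedia archive. Frequent, uninformative words receive a low weight, so they are less likely to dominate the match against a sparse annotation.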

Publication date: 2011